ABSTRACT

This study provides an overview of class size research, examples of 
various class size and pupil-teacher-ratio (PTR) configurations commonly used by 
practitioners, and the most recent findings of scientifically controlled experimental 
Tennessee STAR studies. The learning environment is hierarchical in nature, with 
student-level data influenced by family characteristics external to school; school-level
data influenced by school faculty, staff, and resources; and district-level data
influenced by the schools and the community of which they are a part. This paper 
examines the results of two- and three-level class size models that have been able to 
account for more variability than one-level models and that thus have provided the 
most comprehensive interpretations available of the effects of small classes. Results 
have indicated that smaller classes have been effective with diverse populations of 
students but most effective in the earlier grades (K-3) with minorities or students 
living in poverty. The paper concludes that both PTR and class size provide important 
information for decision makers, that funding sources affect class size decisions, and 
that the class size configuration may often be determined by scheduling considerations 
and by availability of faculty and facilities. (Contains 18 references.) (SM) 






RUNNING HEAD: COMPILATION OF CLASS SIZE FINDINGS 



Compilation of Class Size Findings: 
Grade Level, School, and District 



Marie Miller-Whitehead
Tennessee Valley Educators for Excellence 
TVEE 






Invited Symposium on Class Size Reduction 
Organized by 
C. M. Achilles 

Seton Hall and Eastern Michigan University 



with 






Jean Krieger 

Woodlake Elementary School 
Mark Sharp 

Eastern Michigan University 

Paula Egelson 
SERVE 






Paper presented at the Annual Meeting of the Mid-South Educational Research Association 

Biloxi, MS 

November 5, 2003 









Compilation of Class Size Findings: Grade Level, School, and District 



Introduction 

Class size studies have spanned several decades, backed by empirical evidence gathered
and presented by proponents of the most widely implemented class size models. Perhaps the most
important factor, and the one most likely to affect both the analyses and the interpretation of
findings, has been the careful differentiation between the terms “class size” and “pupil-teacher ratio”
(PTR). Misconceptions about these differences, not only among the general public but also among
many educators, have been the source of much error and have undoubtedly resulted in
studies that either failed to find differences that existed between groups or erroneously
reported differences between groups. Therefore, the present study provides an overview of class
size research, examples of various class size and PTR configurations commonly used by
practitioners, and the most recent findings of scientifically controlled experimental STAR studies.
The learning environment is hierarchical in nature, with student-level data influenced by family
characteristics external to school; school-level data influenced by school faculty, staff, and 
resources; and district-level data influenced by the schools and the community of which they are a 
part. This paper examines the results of two- and three-level class size models that have been able 
to account for more variability than one-level models and that thus have provided the most 
comprehensive interpretations available of the effects of small classes. Results have indicated that 
smaller classes have been effective with diverse populations of students, but most effective in the 
earlier grades (K-3) with minorities or students living in poverty. 

Review of Related Literature

First and most important is the differentiation of class size and PTR, or pupil-teacher
ratio. Many school and school district reports regularly include as an indicator of the district’s
profile a pupil-teacher ratio rather than providing information about class size, or the number of
students in each teacher’s classroom each day. A low pupil-teacher ratio may be indicative of the
availability on staff of various professionals and paraprofessionals who serve as support staff but
who do not regularly teach children in a classroom setting. Pupil-teacher ratio figures may
include guidance counselors, aides, psychologists, social workers, special education teachers,
media specialists, music and art teachers shared among several schools, program coordinators,
school administrators, the school nurse, and school nutrition program staff, all very necessary to
the efficient operation of the school but not necessarily in direct contact with the same children
each day for the purposes of teaching.

Class size has been defined as “the number of youngsters who regularly appear in a 
teacher’s classroom and for whom that teacher is primarily responsible and accountable” 
(Achilles, 1999, p. 14). It is therefore possible and often the case that children in schools with
relatively low PTRs of 15:1 may be taught each day in classes of as many as 25 or more students
(Achilles & Finn, 1999; Achilles, Finn, & Pate-Bain, 2002). In fact, the correlation between
district PTR and class size for 1998 Tennessee data was .28 (p < .01), with PTR accounting for
1% of between-district differences in student science achievement (Miller-Whitehead, 2002a).
However, the percentage of Tennessee schools within a district that were at or below state class
size requirements was positively correlated with science achievement at Grade 5 (Miller-Whitehead,
2002b).
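
The distinction between PTR and class size can be illustrated with a small calculation. The following is a minimal sketch, not a method taken from the studies cited; the function names and the staffing counts for the hypothetical school are assumptions chosen only so that the result matches the 15:1 and 25-student figures mentioned above.

```python
def pupil_teacher_ratio(pupils, classroom_teachers, other_counted_staff):
    """PTR: pupils divided by all professional staff counted in the ratio."""
    return pupils / (classroom_teachers + other_counted_staff)


def average_class_size(pupils, classroom_teachers):
    """Class size: pupils divided by teachers who actually have a class roll."""
    return pupils / classroom_teachers


# Hypothetical school (illustrative numbers only, not data from the paper):
# 300 pupils, 12 classroom teachers, and 8 other staff counted in the PTR
# (e.g., counselors, media specialists, shared music and art teachers).
ptr = pupil_teacher_ratio(300, classroom_teachers=12, other_counted_staff=8)
size = average_class_size(300, classroom_teachers=12)

print(ptr)   # 15.0
print(size)  # 25.0
```

Under these assumed counts the school reports a PTR of 15:1 while each classroom teacher is responsible for 25 students, which is why the two indicators should not be used interchangeably.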

The Tennessee STAR class size studies have examined three classroom configurations: 
small classes of 13 to 17 students (S), larger classes of 22 to 26 students or more (R), and larger
classes with a full-time teacher’s aide (RA). The carefully controlled requirements of the STAR
longitudinal project assured that many of the same students would be enrolled in small classes for
several years (K-3 or 4) and their progress evaluated as they moved through middle and high
school. According to a recent follow-up study of these students five years after their participation
in the STAR small class project, effects ranged from .30 (science) to .37 (math) for students who
had been in small classes in each grade (Nye, Hedges, & Konstantopoulos, 1999). In their 2002
analysis comparing the achievement of low and high achieving students in small classes, the same
team of researchers found that (a) small class effects were greater in reading than math for low
achieving students, (b) small classes were effective for both high and low achieving students, and
(c) effect sizes for males in small classes were larger than for females in small classes (Nye,
Hedges, & Konstantopoulos, 2002).

However, many schools and school districts use specialists who work with small pull-out 
groups or who work directly with students in a one-on-one setting for tutoring in reading and 
math, most often during regular school hours but sometimes in after-school or summer programs. 
Teachers in larger classes of 22 to 26 students may have a teacher’s aide with a teaching degree, 
who regularly teaches a small group of children, often in a separate room or pod such as a media 
center. According to Slavin (1994), multiple studies have found that the difference in reading
achievement of students in regular classes compared to those in regular-sized classes with
teachers’ aides has ranged from zero to very small. However, reading achievement of children
at-risk in regular classes who received one-on-one tutoring improved. Therefore, according to
some researchers, the effect on student achievement of the regular-size class with a teacher’s aide
was largely dependent upon the extent to which the aide’s duties were teacher-focused or
child-focused. Re-analysis of the RA configuration from STAR data has provided little evidence that
teachers’ aides whose duties were primarily custodial or who spent most of their time assisting
with paperwork had a positive effect on student achievement (Gerber, Finn, Achilles, &
Boyd-Zaharias, 2001).

As a result of the wide variety of programs and models in the schools, researchers have
either designed experimental studies in which class size has been carefully defined and controlled
(such as the STAR study) or they have developed sophisticated class observational coding
methodologies to organize existing data in a meaningful way for meta-analysis (Glass, McGaw, &
Smith, 1981). However, the coding of observational data requires training and consensus on the
part of researchers so that each observer codes variables such as class size in the same way for
each record or observation. Data drawn from existing records or reports may need to be recoded.
For example, when two grade levels or subjects are taught concurrently in the same room by the
same teacher, it may appear on a preliminary state report as two separate classes of 17 or 16
students each, rather than as one class of well over 30 students. Such classroom configurations
are not uncommon when administrators on tight budgets attempt to provide advanced classes for
high school students, and they may mislead researchers who believe they have data for two
small classes rather than for one large class. This kind of classroom configuration would certainly
have an effect on time devoted to individualized instruction as well as on student achievement.
For this reason, NCES defines class size differently for teachers teaching in single self-contained
classes than for teachers in multiple departmentalized classes (i.e., team teaching or inclusion
configurations), where class size is computed as the average number of students in a teacher’s
class (McLaughlin & Drori, 2000).
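
The recoding problem described above can be sketched in a few lines. This is an illustrative sketch only: the record layout (teacher, room, period, enrolled), the identifiers, and the sample rows are hypothetical and do not reproduce any actual state report format or the coding rules of the studies cited.

```python
from collections import defaultdict


def recode_concurrent_classes(records):
    """Merge report rows that share a teacher, room, and period into one class.

    Rows taught at the same time, in the same room, by the same teacher are
    treated as a single class whose size is the sum of the enrollments.
    """
    merged = defaultdict(int)
    for rec in records:
        key = (rec["teacher"], rec["room"], rec["period"])
        merged[key] += rec["enrolled"]
    return dict(merged)


# Hypothetical preliminary report: two sections listed separately (17 and 16
# students) that actually meet together, plus one ordinary class.
report = [
    {"teacher": "T01", "room": "204", "period": 3, "enrolled": 17},
    {"teacher": "T01", "room": "204", "period": 3, "enrolled": 16},
    {"teacher": "T02", "room": "110", "period": 3, "enrolled": 24},
]

class_sizes = recode_concurrent_classes(report)
print(class_sizes[("T01", "204", 3)])  # 33
print(class_sizes[("T02", "110", 3)])  # 24
```

Before recoding, the first teacher appears to have two small classes; after recoding, the same rows describe one class of 33 students.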

A recent independent study conducted with NAEP data (Grissmer, Flanagan, Kawata, & 
Williamson, 2000) verified the efficacy of small classes for student achievement in reading and 
math. From a hierarchical perspective, the Grissmer et al. study found that the majority of
explained variation in student-level data, i.e., test scores, could be identified by family and
demographic characteristics such as parent education level, family income, and ethnicity. A
confounding issue for the STAR study was that the percentage of minority students in the sample
was greater than in the general population of Tennessee school children, although assignment was
random within schools and districts that participated in the study. However, Grissmer et al.
concluded that small classes not only improved achievement for some groups of students, 
particularly minorities, but they also were more cost effective than regular classes with teachers’ 
aides. 

Hanushek (1999) conducted an independent validation of the Tennessee STAR project,
finding that children in the STAR small class (S) configuration performed .12 standard deviations
above the kindergarten mean in both math and reading, while children in kindergarten classes of
22-25 achieved .05 standard deviations below the mean, a difference of .17 standard deviations
between these two class size configurations. Hanushek’s study concluded that the positive effects
of small classes were limited to kindergarten and first grade, with later student achievement an
artifact of the positive effect in kindergarten. Finn and Achilles (1999) found that kindergarten
students in small classes were about a month ahead of children in regular kindergarten classes, and
by the end of the fifth grade were about half a year ahead of their peers in regular classes in all
subjects. Like Slavin and others, Hanushek found no evidence that student achievement was
affected by the presence or absence of a full-time teacher’s aide in the classroom. Hanushek also
claimed that variations in teacher quality across studies accounted for greater effects on student
achievement than did class size, a conclusion shared by other researchers who have found strong
evidence of the “teacher effect” on student achievement, after controlling for other variables.
However, although Hanushek found that teacher quality had a far stronger effect on student
achievement than class size, his study did not address other benefits in social capital, such as 
improved attitudes towards school, graduation rates, discipline, and the like. 

Although student achievement is a primary consideration for many policymakers, many of 
the benefits that accrue from small classes have an indirect effect on student achievement. In
addition to improving reading and overall academic achievement in school districts in several
states, small classes helped improve teacher morale, reduced discipline problems, reduced the
need for remediation through early identification and prevention of problems, improved
graduation rates, lowered dropout rates, decreased teen pregnancy rates, and were more likely to
produce graduates with advanced or honors diplomas as well as students who took the ACT and
SAT college entrance exams or who planned to attend college (Cohen, Miller, Stonehill, &
Geddes, 2000; Egelson, Harman, Hood, & Achilles, 2002; Krueger & Whitmore, 1998).

As an example at the school level, a study conducted by NCES using data from the 1993-1994
Schools and Staffing Survey reported standardized gamma coefficients of -0.25, -0.38*, and
-0.36* for the effect of larger classes on student achievement in elementary, middle, and
secondary schools, respectively, and coefficients of -0.12, -0.18, and -0.00 for the effects of class
size on school climate in elementary, middle, and secondary schools. The study also found that
larger schools were associated with larger classes, with standardized gamma coefficients of 0.47*,
0.57*, and 0.63*, respectively. At the national level, the study found that the largest between-state
variations were for student achievement, school sizes, class sizes, and percentages of
minorities and English Language Learners. As a correlate of achievement at the state level, class
size (.89) was exceeded only by language barriers (.96) and minority percentage (.93)
(McLaughlin & Drori, 2000). Of these three variables, class size is the one condition that a school
or district administrator can actively control and vary to assure that all students receive adequate 
opportunities for learning. In effect, these results point to the diversity of students from state to 
state and to the differences between states in maximum allowable class sizes. 

A Small Class Example Using Grade 5 Science Achievement 
The input-process-outcome model has been widely used to describe social systems, 
although it assumes that these are invariant across time and does not take into account growth 
that may occur as a result of systemic change. Results of state-mandated student achievement 
tests are often reported as scale scores to provide a method for tracking student cohort progress 
on a continuous scale across multiple grade levels, such as Grades 2 through 8. For example, the 
follow-up studies of the original cohort of STAR children attempt to determine growth or change
in predicted outcomes over time of children in small classes after taking into account factors such
as family background, school, and community characteristics (see, for example, Finn, Gerber,
Achilles, & Boyd-Zaharias, 2001, and Nye, Hedges, & Konstantopoulos, 2002).

Insert Figure 1 about here



As can be seen from Figure 1, there were differences in student achievement across the
state by grade level and by year for each cohort shown, and grade level had the greatest effect on
mean science achievement. It is naturally of interest for teachers, parents, administrators, and
policymakers to be able to use information derived from these differences for program and
curriculum planning and budgeting for school improvement. A proposed measurement model for
student-level science achievement might include social background variables such as family
income, father’s and mother’s education level, student ability, motivation, predominant language,
and race. School-level effects could include organizational features of the school such as class
size, school size, teacher efficacy, teacher qualifications, school climate, and school expectations.

Of the ten Tennessee school districts that had the highest Grade 5 science achievement in
1998, six had 100% of schools in the district at or below maximum class size requirements (state
mean = 56%), an indication that in these districts some classes were not made smaller at the expense of
making other classes larger. All but one of these districts were small districts, with fewer than
5,000 students; one had 66% of students qualified for free or reduced price meals. Only one of the
systems had more than the state average of ethnic minority students (mean = 26%). These results
suggested that although students in communities with large percentages of ethnic minority
students or students living in poverty may benefit most from small classes, there were statistically
significant positive results as well for students in schools with student populations more nearly
reflective of state averages, a conclusion similar to that of the 2002 Nye, Hedges, and
Konstantopoulos re-analysis of STAR data. Four of the districts were in counties that were below
the state average for per capita income (mean = $15,194). One district with the highest scale
score achievement for Grade 5 science had only 76% of schools within the district at or below
class size requirements, better than the state average of 56%, but 11 of its 15 elementary schools
had lower than predicted value-added gain scores for Grade 5 science. Table 1 provides results of
hierarchical multiple regression for the effects of class size and school district and community
demographics on Grade 5 science achievement.

Insert Table 1 about here 



The results in Table 1 indicate that for a sample of school districts with the highest and
lowest Grade 5 student science achievement in 1998, class size had a more significant effect than
ethnicity or fiscal resources such as per capita income and per pupil expenditure. However, per
pupil expenditure was positively correlated with percentage of classes at or below mandated class
size (r = .13, p < .05) and so to this extent both were measures of resources allocated to the
classroom. Figure 2 provides the conceptual diagram for the path model.
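
Table 1 reports adjusted R2 values for a sample of 20 districts. As a reading aid, the sketch below shows the standard adjustment for sample size and number of predictors, and its inverse. The function names are hypothetical, the count of five predictors is an assumption read from the Grade 5 science block of Table 1, and the unadjusted R2 is not reported in the paper, so the back-calculated value is illustrative only.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def r2_from_adjusted(adj_r2, n, k):
    """Invert the adjustment to recover the unadjusted R-squared."""
    return 1.0 - (1.0 - adj_r2) * (n - k - 1) / (n - 1)


# Assumed inputs: adjusted R2 = .53 and n = 20 come from Table 1;
# k = 5 predictors is an assumption about how the model was specified.
implied_r2 = r2_from_adjusted(0.53, n=20, k=5)
print(round(implied_r2, 2))  # 0.65
```

With only 20 districts and several predictors, the adjustment is substantial, which is one reason the adjusted value is the more conservative figure to report for a sample of this size.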



Insert Figure 2 about here 



A Small Class Example Using Continuing Education Data 
The author has had personal experience with small class instruction for adult learners. In 
each instance the small class configuration was an integral component of the teaching and learning 
experience, with student performance measured as mastery to specific standards or criteria for 
exam preparation or for earning C.E.U.s. For some modules pre-assessments were conducted as
needs assessments or for diagnostic purposes only. For these students a value-added, gain, or 
growth model was not appropriate since student achievement measures were essentially 
performance-based assessments and required students to complete tasks or produce a product as 
members of a team. In fact, the team participation requirements were such that students who were
absent for part of the training were usually required to repeat the entire module with another
group. The design of the instructional modules and the nature and complexity of the tasks were such
that they would have been impossible to complete in the time allotted with a group of more than
10 to 16 students. The addition of even two or three students reduced the time available for
teacher-student questioning and answering, brainstorming, and problem-solving activities in which
each student was expected to participate. In some cases team rapport and cohesion were slower
to develop, particularly when group members ranged from adults with limited English proficiency
(LEP) and as little as 3 or 4 years of education to college graduates. For the adult learner
configuration, the small group model assured that each member’s unique skills and expertise
contributed to problem solving, thus improving the overall solution (Miller-Whitehead, 1998). For this small class model, results were
affected directly by the effect of class size on group participation behaviors as well as by the 
indirect effect of class size on student-teacher contact time. 

Table 2 provides a summary of exogenous variables that may have a direct or indirect
effect on class size results in addition to the more commonly considered student-level and
family-level variables such as family income, ethnicity, predominant language, and parents’ education.

Insert Table 2 about here 

Conclusions

The Grade 5 science achievement study examined only one small class outcome, namely
student achievement as measured by scale scores on a single test, in this case the TerraNova. For
this reason it may be of interest to examine additional achievement measures, such as value-added 
or gain scores. This is particularly true for districts where student achievement is either much 
higher or lower than average. Where many students are at the poverty level, scale scores are 
likely to be lower than average while gain scores may be quite high. On the other hand, where 
scale scores are higher than average, gain scores may not exceed predicted percent gains. If these 
same districts have larger classes and lower per pupil expenditures, lowering class size and 
directing more resources to the classroom may improve value-added or gain scores without 
lowering scale scores. 

Student achievement encompasses more than test scores, however. Other published 
studies have found that small classes have been observed to increase time-on-task, hands-on 
activities, social climate, classroom management, student participation, parent involvement, 
diagnosis, space, teacher and student morale, and individual attention given to students. Small 
classes have also been observed to decrease retention, discipline problems, special education 
placement, and stress. National surveys of teachers conducted by the U.S. Department of
Education have identified class size and student discipline as major concerns, ahead of reducing
workload and paperwork or raising academic standards (Miller-Whitehead, 2002a). In higher
education environments, small groups of 10 to 16 students reached consensus in problem-solving
activities more quickly, were more likely to have all members participate, and shared tasks more
equally (Miller-Whitehead, 1998).

There are many issues that confound class size research that has been conducted outside
of controlled experimental studies such as STAR. These relate not only to changes of placement
and attrition of participants over time, but to student participation in a variety of small group
configurations such as after school and summer programs. A significant number of students may
be in classes of 22 to 26 during the school day while concurrently participating in tutorials or after
school and summer programs designed to improve student achievement, thus contaminating
results of comparisons between S, R, and RA groups (Table 2). These teaching and learning
configurations may be neither more cost- nor time-efficient than placing students in small classes
during the regular school day. Thus, making comparisons between groups without controlling for
such exogenous variables may be problematic.

The results of the present study were that school districts with the highest Grade 5 science 
scale scores were significantly more likely to have all classes within the district at or below 
Tennessee’s mandated class size requirements for each grade level. For example, one high-achieving
district had higher scale scores for Grade 4 science in its non-Title I schools than in its
Title I schools, but the non-Title I schools had an average gain of only 90% of that predicted,
while the Title I schools, with mandated class size reduction (CSR), had 134% of predicted gain in science
achievement. Without the use of student-level data it would be difficult if not impossible to
determine if school districts that had large percentages of oversize classes were providing 
tutorials, computer labs, or other before- and after-school programs that had a significant effect 
on student science achievement. In well-designed large-scale assessments, complex sampling of all 
subpopulations may control for randomly distributed variations in learning conditions under the 
assumption that student participation in such programs is normally distributed across populations. 
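
The contrast between scale scores and gain relative to prediction can be made concrete with a short calculation. This is a minimal sketch under stated assumptions: the function name is hypothetical, and the scale-score gains are invented so that the results reproduce the 90% and 134% figures quoted above; they are not values from the Tennessee data, and the state's actual value-added model is not shown here.

```python
def percent_of_predicted_gain(actual_gain, predicted_gain):
    """Express an observed gain as a percentage of the predicted gain."""
    if predicted_gain == 0:
        raise ValueError("predicted gain must be non-zero")
    return 100 * actual_gain / predicted_gain


# Hypothetical gains in scale-score points (illustrative only).
non_title_i = percent_of_predicted_gain(actual_gain=45, predicted_gain=50)
title_i = percent_of_predicted_gain(actual_gain=67, predicted_gain=50)

print(non_title_i)  # 90.0
print(title_i)      # 134.0
```

A school group can therefore have the higher scale scores and still fall short of its predicted gain, while a lower-scoring group exceeds its prediction.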

Results further indicated that systemic change had occurred in many school districts that
had 100% of classes and schools at or below class size requirements and that the change was 
significant enough that the state of Tennessee ceased to issue class size waivers, mandating that 
districts should not exceed maximum allowable class sizes by grade level, subject area, or for 
groups of students in specialized programs. 

When students participate concurrently in more than one learning condition, i.e., regular
class (RA or R) and after-school or summer small class (S), some adjustment should be made to
control for the time spent in each learning condition. While there are several strategies available 
to the researcher, pre- and posttest results could be used to determine separate effects for each 
class, or random sampling plans could be implemented to assure that these conditions are 
normally distributed across populations. When value-added or gain scores are used, students may 
show greater gains (a different issue than maintaining gains in later years that were made during 
the duration of the experiment) in years that they participate in special programs than they do in 
subsequent years if they cease to receive one-on-one instruction or to participate in before- and 
after-school or summer programs. 
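
One simple way to begin such an adjustment is to record the share of instructional time a student spends in each condition. The sketch below is illustrative only and is not the procedure used in STAR or in the present study; the function name and the hours for the hypothetical student are assumptions.

```python
def exposure_shares(hours_by_condition):
    """Return the fraction of total instructional hours spent in each condition."""
    total = sum(hours_by_condition.values())
    if total <= 0:
        raise ValueError("total instructional hours must be positive")
    return {cond: hrs / total for cond, hrs in hours_by_condition.items()}


# Hypothetical student: 900 hours in a regular class (R) during the school day
# and 60 hours in a small-group (S) after-school program.
student_hours = {"R": 900, "S": 60}
shares = exposure_shares(student_hours)

print(shares["R"])  # 0.9375
print(shares["S"])  # 0.0625
```

Shares of this kind could then enter an analysis as covariates or weights, so that a student counted in the R group is not treated as having had no small class exposure at all.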

In higher education or continuing education with students of varying backgrounds and 
abilities, small class configurations of one instructor and 10 to 16 students were more effective 
than classes of one instructor and 18 to 20 students. The small class groups interacted and
participated more equally, displayed more effective problem-solving strategies, and were more
likely to share skills and expertise with group members (Miller-Whitehead, 1998). However,
larger classes (18 to 20) with two instructors were as effective for adult learners as small classes
with one instructor. The extant evidence suggests that both PTR and class size (CS) provide important
information for decision-makers, that funding sources affect class size decisions, and that the
choice of class size configuration (S, R, RA) may often be determined by scheduling
considerations and by availability of faculty and facilities.

Figure 1. Grade-Level Science Achievement Growth for 1988, 1989, 1990 Kindergarten Cohorts

[Figure not reproduced: Science Achievement Growth]

Note. Scale scores are for Tennessee TerraNova, a component of state-wide accountability.

Figure 2. Conceptual model of class size and Grade 5 science achievement.

Table 1

Multiple Regression for Effects of Class Size and Demographics on Grade 5 Science Achievement
(n = 20).

Indicator     Predictor            B          SE B     β
G5 Science    Class size           .24        .11      .36*
              SES                  -.802      28.10    -.01
              Ethnicity            38.99      19.41    .55
              Per capita income    .001       .001     .22
              Per pupil $          .006       .004     .26
              Adj. R2 = .53
SES           Per capita income    -.00002    .000     -.52**
              Ethnicity            -.58       .103     -.76***
              Adj. R2 = .66

Note. * p < .05, ** p < .01, *** p < .001. Data were a sample of 20 school districts from the upper and lower quintiles
for Grade 5 science achievement in 1998.

Table 2

Some Questions Worth Asking When Investigating Class Size Effects

Question

1. How many students were enrolled in the class continuously for the entire year?

2. For how many of these students were pre-test data available?

3. How many classes that began as Small (S) became Regular (R) during the year?

4. How many classes that began as Regular (R) became Small (S) during the year?

5. How many students in the small class (S) condition for the entire year were absent enough
days to affect their achievement, even if they were on roll continuously?

6. How many students in the Small (S), Regular (R), or RA classes were in pull-out groups or taught
in adjacent pods or media centers for some portion of the day or all day?

7. How many S, R, RA students participated in computer labs, after school, summer
enrichment, or tutorial programs? How many hours, days, or weeks did they receive
instruction? In what subject areas?

8. How many children received extra help from parents at home? Approximately how many
hours per week?